Abstract

The aim of this paper is to prove the achievability of fixed-rate universal coding problems by using our 
previously introduced notion of hash property. These problems are the fixed-rate lossless universal source coding 
problem and the fixed-rate universal channel coding problem. Since an ensemble of sparse matrices satisfies the 
hash property requirement, it is proved that we can construct universal codes by using sparse matrices. 

Index Terms 

channel coding, fixed-rate universal codes, hash functions, linear codes, lossless source coding, 
minimum-divergence encoding, minimum-entropy decoding, Shannon theory, sparse matrix 

I. Introduction 

The notion of hash property is introduced in [12]. It is a sufficient condition for the achievability of coding 
theorems including lossless and lossy source coding, channel coding, the Slepian-Wolf problem, the Wyner-Ziv 
problem, the Gel'fand-Pinsker problem, and the problem of source coding with partial side information at the 
decoder. Since an ensemble of sparse matrices satisfies the hash property requirement, it is proved that we can 
construct codes by using sparse matrices and maximum-likelihood coding. 

However, it is assumed in [12] that source and channel distributions are used when designing a code. The 
aim of this paper is to prove fixed-rate universal coding theorems based on the hash property, where a specific 
probability distribution is not assumed for the design of a code and the error probability of a code vanishes for 
all sources specified by the encoding rate. 

We prove theorems of fixed-rate lossless universal source coding (see Fig. 1) and fixed-rate universal channel 
coding (see Fig. 2). In the construction of codes, the maximum-likelihood coding used in [12] is replaced by 
a minimum-divergence encoder and a minimum-entropy decoder. A practical algorithm has been obtained for 
the minimum-entropy decoder by using linear programming [2]. It should be noted that a practical algorithm 
for the minimum-divergence encoder can also be obtained by using linear programming as shown in Section V. 
The fixed-rate lossless universal source coding theorem is proved in [3] for the ensemble of all linear matrices 
in the context of the Slepian-Wolf source coding problem, in [7] for the class of universal hash functions, and 
in [11] implicitly for an ensemble of sparse matrices in the context of a secret key agreement from correlated 
source outputs. The universal channel coding theorem that employs sparse matrices is proved in [8] for an 
additive noise channel and in [9] for an arbitrary channel. It should be noted here that the linearity of an 
ensemble member is not assumed in our proof. Our proof assumes only that ensembles of sparse matrices have a 
hash property, and so it is simpler than the previously reported proofs [11][8][9]. 

Fig. 1. Lossless Source Coding 

Fig. 2. Channel Coding 

II. Definitions and Notations 
Throughout this paper, we use the following definitions and notations. 

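Column vectors and sequences are denoted in boldface. Let $A\boldsymbol{u}$ denote a value taken by a function $A : \mathcal{U}^n \to \overline{\mathcal{U}}$ at $\boldsymbol{u} \in \mathcal{U}^n$, where $\mathcal{U}^n$ is the domain of the function. It should be noted that $A$ may be non-linear. For a function $A$ and a set of functions $\mathcal{A}$, let $\mathrm{Im}A$ and $\mathrm{Im}\mathcal{A}$ be defined as 

$$\mathrm{Im}A \equiv \{A\boldsymbol{u} : \boldsymbol{u} \in \mathcal{U}^n\}$$
$$\mathrm{Im}\mathcal{A} \equiv \bigcup_{A \in \mathcal{A}} \mathrm{Im}A.$$

The cardinality of a set $\mathcal{U}$ is denoted by $|\mathcal{U}|$, and $\mathcal{U} - \{u\}$ is a set difference. We define sets $\mathcal{C}_A(\boldsymbol{c})$ and $\mathcal{C}_{AB}(\boldsymbol{c}, \boldsymbol{m})$ as 

$$\mathcal{C}_A(\boldsymbol{c}) \equiv \{\boldsymbol{u} : A\boldsymbol{u} = \boldsymbol{c}\}$$
$$\mathcal{C}_{AB}(\boldsymbol{c}, \boldsymbol{m}) \equiv \{\boldsymbol{u} : A\boldsymbol{u} = \boldsymbol{c},\ B\boldsymbol{u} = \boldsymbol{m}\}.$$

In the context of linear codes, $\mathcal{C}_A(\boldsymbol{c})$ is called a coset determined by $\boldsymbol{c}$. 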

Let $p$ and $p'$ be probability distributions and let $q$ and $q'$ be conditional probability distributions. Then entropy $H(p)$, conditional entropy $H(q|p)$, divergence $D(p\|p')$, and conditional divergence $D(q\|q'|p)$ are defined as 

$$H(p) \equiv \sum_{u} p(u) \log \frac{1}{p(u)}$$
$$H(q|p) \equiv \sum_{u,v} q(u|v)p(v) \log \frac{1}{q(u|v)}$$
$$D(p\|p') \equiv \sum_{u} p(u) \log \frac{p(u)}{p'(u)}$$
$$D(q\|q'|p) \equiv \sum_{v} p(v) \sum_{u} q(u|v) \log \frac{q(u|v)}{q'(u|v)},$$

where we assume the base 2 of the logarithm. 

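Let $\mu_{UV}$ be the joint probability distribution of random variables $U$ and $V$. Let $\mu_U$ and $\mu_V$ be the respective marginal distributions and $\mu_{U|V}$ be the conditional probability distribution. Then the entropy $H(U)$, the conditional entropy $H(U|V)$, and the mutual information $I(U;V)$ of random variables are defined as 

$$H(U) \equiv H(\mu_U)$$
$$H(U|V) \equiv H(\mu_{U|V}|\mu_V)$$
$$I(U;V) \equiv H(\mu_U) + H(\mu_V) - H(\mu_{UV}).$$

Let $\nu_{\boldsymbol{u}}$ and $\nu_{\boldsymbol{u}|\boldsymbol{v}}$ be defined as 

$$\nu_{\boldsymbol{u}}(u) \equiv \frac{|\{1 \le i \le n : u_i = u\}|}{n}$$
$$\nu_{\boldsymbol{u}|\boldsymbol{v}}(u|v) \equiv \frac{\nu_{\boldsymbol{u}\boldsymbol{v}}(u,v)}{\nu_{\boldsymbol{v}}(v)}.$$

We call $\nu_{\boldsymbol{u}}$ a type¹ of $\boldsymbol{u} \in \mathcal{U}^n$ and $\nu_{\boldsymbol{u}|\boldsymbol{v}}$ a conditional type. Let $U \equiv \nu_{\boldsymbol{u}}$ be the type of a sequence and $U|V \equiv \nu_{\boldsymbol{u}|\boldsymbol{v}}$ be the conditional type of a sequence given a sequence of type $V$. Then a set of typical sequences $\mathcal{T}_U$ and a set of conditionally typical sequences $\mathcal{T}_{U|V}(\boldsymbol{v})$ are defined as 

$$\mathcal{T}_U \equiv \{\boldsymbol{u} : \nu_{\boldsymbol{u}} = \nu_U\}$$
$$\mathcal{T}_{U|V}(\boldsymbol{v}) \equiv \{\boldsymbol{u} : \nu_{\boldsymbol{u}|\boldsymbol{v}} = \nu_{U|V}\},$$

respectively. The empirical entropy, the empirical conditional entropy, and the empirical mutual information are defined as 

$$H(\boldsymbol{u}) \equiv H(\nu_{\boldsymbol{u}})$$
$$H(\boldsymbol{u}|\boldsymbol{v}) \equiv H(\nu_{\boldsymbol{u}|\boldsymbol{v}}|\nu_{\boldsymbol{v}})$$
$$I(\boldsymbol{u};\boldsymbol{v}) \equiv H(\nu_{\boldsymbol{u}}) + H(\nu_{\boldsymbol{v}}) - H(\nu_{\boldsymbol{u}\boldsymbol{v}}).$$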
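As a minimal sketch of the empirical quantities defined above, the type of a sequence and the derived entropies can be computed directly from symbol counts. The function names are illustrative assumptions and do not come from the paper. The conditional entropy is computed through the chain rule $H(\boldsymbol{u}|\boldsymbol{v}) = H(\nu_{\boldsymbol{u}\boldsymbol{v}}) - H(\nu_{\boldsymbol{v}})$, which agrees with the definition $H(\nu_{\boldsymbol{u}|\boldsymbol{v}}|\nu_{\boldsymbol{v}})$.

```python
from collections import Counter
from math import log2


def type_of(seq):
    """Return the type (empirical distribution) of a sequence as a dict."""
    n = len(seq)
    return {sym: count / n for sym, count in Counter(seq).items()}


def empirical_entropy(seq):
    """Empirical entropy H(u) = H(nu_u), in bits."""
    return -sum(p * log2(p) for p in type_of(seq).values())


def empirical_conditional_entropy(u, v):
    """Empirical conditional entropy H(u|v) via the chain rule."""
    return empirical_entropy(list(zip(u, v))) - empirical_entropy(v)


def empirical_mutual_information(u, v):
    """Empirical mutual information I(u;v) = H(u) + H(v) - H(u,v)."""
    joint = empirical_entropy(list(zip(u, v)))
    return empirical_entropy(u) + empirical_entropy(v) - joint
```

For instance, two identical binary sequences with balanced symbols have one bit of empirical mutual information, whereas sequences whose joint type is uniform over all four symbol pairs have none.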
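In the construction of a universal source code, we use a minimum-entropy decoder 

$$g_A(\boldsymbol{c}) \equiv \arg\min_{\boldsymbol{x}' \in \mathcal{C}_A(\boldsymbol{c})} H(\boldsymbol{x}').$$

It should be noted that the linear programming technique introduced in [2] can be applied to the minimum-entropy decoder $g_A$. In the construction of a universal channel code, we use a minimum-divergence encoder 

$$g_{AB}(\boldsymbol{c}, \boldsymbol{m}) \equiv \arg\min_{\boldsymbol{x}' \in \mathcal{C}_{AB}(\boldsymbol{c}, \boldsymbol{m})} D(\nu_{\boldsymbol{x}'}\|\mu_X)$$

and a minimum-entropy decoder 

$$g_A(\boldsymbol{c}, \boldsymbol{y}) \equiv \arg\min_{\boldsymbol{x}' \in \mathcal{C}_A(\boldsymbol{c})} H(\boldsymbol{x}'|\boldsymbol{y}).$$

¹In [12], the type of a sequence is defined as a histogram $\{n\nu_{\boldsymbol{u}}(u)\}_{u \in \mathcal{U}}$. 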
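It should be noted that we have 

$$g_{AB}(\boldsymbol{c}, \boldsymbol{m}) = \arg\max_{\boldsymbol{x}' \in \mathcal{C}_{AB}(\boldsymbol{c}, \boldsymbol{m})} \left[\log \mu_X(\boldsymbol{x}') + nH(\nu_{\boldsymbol{x}'})\right] = \arg\max_{U'} \left[nH(U') + \max_{\boldsymbol{x}' \in \mathcal{C}_{AB}(\boldsymbol{c}, \boldsymbol{m}) \cap \mathcal{T}_{U'}} \log \mu_X(\boldsymbol{x}')\right]$$

from Lemma 7. When functions $A$ and $B$ are linear, the linear programming technique introduced in [6] can be applied to the maximization $\max_{\boldsymbol{x}'} \mu_X(\boldsymbol{x}')$ because $U'$ is fixed and the constraint condition $\boldsymbol{x}' \in \mathcal{C}_{AB}(\boldsymbol{c}, \boldsymbol{m}) \cap \mathcal{T}_{U'}$ is represented by linear functions. 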
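The equivalence between minimizing the divergence and maximizing $\log \mu_X(\boldsymbol{x}') + nH(\nu_{\boldsymbol{x}'})$ can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: `candidates` stands in for the set $\mathcal{C}_{AB}(\boldsymbol{c}, \boldsymbol{m})$, `mu` is a memoryless input distribution with strictly positive probabilities, and all function names are hypothetical. The search is exhaustive and does not use the linear programming technique mentioned in the text.

```python
from collections import Counter
from math import log2


def type_of(x):
    """Type (empirical distribution) of the sequence x."""
    n = len(x)
    return {s: k / n for s, k in Counter(x).items()}


def entropy(x):
    """Empirical entropy H(nu_x) in bits."""
    return -sum(p * log2(p) for p in type_of(x).values())


def divergence(x, mu):
    """D(nu_x || mu) in bits; mu maps each symbol to a positive probability."""
    return sum(p * log2(p / mu[s]) for s, p in type_of(x).items())


def log_prob(x, mu):
    """log2 of the memoryless probability mu(x) = prod_i mu(x_i)."""
    return sum(log2(mu[s]) for s in x)


def min_divergence_encode(candidates, mu):
    """Pick a candidate whose type is closest to mu in divergence."""
    return min(candidates, key=lambda x: divergence(x, mu))


def max_score_encode(candidates, mu):
    """Pick a candidate maximizing log mu(x) + n H(nu_x)."""
    return max(candidates, key=lambda x: log_prob(x, mu) + len(x) * entropy(x))
```

Both selection rules return sequences of the same type, because the score equals $-nD(\nu_{\boldsymbol{x}'}\|\mu_X)$; ties within a type class may be broken differently.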



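Finally, we define $\chi(\cdot)$ as 

$$\chi(a = b) \equiv \begin{cases} 1, & \text{if } a = b \\ 0, & \text{if } a \ne b \end{cases}$$
$$\chi(a \ne b) \equiv \begin{cases} 1, & \text{if } a \ne b \\ 0, & \text{if } a = b. \end{cases}$$

We define a sequence $\{\lambda_{\mathcal{U}}(n)\}_{n=1}^{\infty}$ as 

$$\lambda_{\mathcal{U}}(n) \equiv \frac{|\mathcal{U}| \log[n+1]}{n}. \tag{1}$$

It should be noted here that the product set $\mathcal{U} \times \mathcal{V}$ is denoted by $\mathcal{UV}$ when it appears in the subscript of this function, and we omit argument $n$ of $\lambda_{\mathcal{U}}$ when $n$ is clear from the context. We define $|\cdot|^+$ as 

$$|\theta|^+ \equiv \begin{cases} \theta, & \text{if } \theta > 0 \\ 0, & \text{if } \theta \le 0. \end{cases} \tag{2}$$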



III. $(\alpha, \beta)$-Hash Property 

In this section, we review the notion of the $(\alpha, \beta)$-hash property introduced in [12]. This is a sufficient 
condition for coding theorems, where the linearity of functions is not assumed. By using this notion, we prove 
a fixed-rate universal source coding theorem and a fixed-rate universal channel coding theorem. 

Throughout the paper, $A\boldsymbol{u}$ denotes a value taken by a function $A$ at $\boldsymbol{u} \in \mathcal{U}^n$, where $\mathcal{U}^n$ is the domain of the 
function. It should again be noted here that $A$ may be non-linear. We define the $(\alpha, \beta)$-hash property in the 
following. 

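Definition 1: Let $\mathcal{A}$ be a set of functions $A : \mathcal{U}^n \to \overline{\mathcal{U}}_{\mathcal{A}}$ and we assume that $\mathrm{Im}A = \mathrm{Im}\mathcal{A}$ for all $A \in \mathcal{A}$ and 

$$\lim_{n \to \infty} \frac{1}{n} \log \frac{|\overline{\mathcal{U}}_{\mathcal{A}}|}{|\mathrm{Im}\mathcal{A}|} = 0. \tag{H1}$$

Let $p_A$ be a probability distribution on $\mathcal{A}$. We call a pair $(\mathcal{A}, p_A)$ an ensemble. Then, $(\mathcal{A}, p_A)$ has an $(\alpha, \beta)$-hash property if $\alpha \equiv \{\alpha(n)\}_{n=1}^{\infty}$ and $\beta \equiv \{\beta(n)\}_{n=1}^{\infty}$ satisfy 

$$\lim_{n \to \infty} \alpha(n) = 1 \tag{H2}$$
$$\lim_{n \to \infty} \beta(n) = 0 \tag{H3}$$

and 

$$\sum_{\substack{\boldsymbol{u} \in \mathcal{T} \\ \boldsymbol{u}' \in \mathcal{T}'}} p_A(\{A : A\boldsymbol{u} = A\boldsymbol{u}'\}) \le |\mathcal{T} \cap \mathcal{T}'| + \frac{|\mathcal{T}||\mathcal{T}'|\alpha(n)}{|\mathrm{Im}\mathcal{A}|} + \min\{|\mathcal{T}|, |\mathcal{T}'|\}\beta(n) \tag{H4}$$

for any $\mathcal{T}, \mathcal{T}' \subset \mathcal{U}^n$. Throughout this paper, we omit argument $n$ of $\alpha$ and $\beta$ when $n$ is fixed. ■ 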

In the following, we present two examples of ensembles that have a hash property. 
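Example 1: In this example, we consider a universal class of hash functions introduced in [5]. A set $\mathcal{A}$ of functions $A : \mathcal{U}^n \to \overline{\mathcal{U}}$ is called a universal class of hash functions if 

$$|\{A : A\boldsymbol{u} = A\boldsymbol{u}'\}| \le \frac{|\mathcal{A}|}{|\overline{\mathcal{U}}|}$$

for any $\boldsymbol{u} \ne \boldsymbol{u}'$. For example, the set of all functions on $\mathcal{U}^n$ and the set of all linear functions $A : \mathcal{U}^n \to \mathcal{U}^{l_A}$ are universal classes of hash functions (see [5]). 

It should be noted that every example above satisfies $\mathrm{Im}\mathcal{A} = \overline{\mathcal{U}}$. When $\mathcal{A}$ is a universal class of hash functions and $p_A$ is the uniform probability on $\mathcal{A}$, we have 

$$\sum_{\substack{\boldsymbol{u} \in \mathcal{T} \\ \boldsymbol{u}' \in \mathcal{T}'}} p_A(\{A : A\boldsymbol{u} = A\boldsymbol{u}'\}) \le |\mathcal{T} \cap \mathcal{T}'| + \frac{|\mathcal{T}||\mathcal{T}'|}{|\mathrm{Im}\mathcal{A}|}.$$

This implies that $(\mathcal{A}, p_A)$ has a $(\boldsymbol{1}, \boldsymbol{0})$-hash property, where $\alpha(n) \equiv 1$ and $\beta(n) \equiv 0$ for every $n$. ■ 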
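The collision bound for the set of all linear functions can be verified exhaustively in a very small case. The following sketch enumerates every binary $l \times n$ matrix and counts, for each pair of distinct inputs, how many matrices map them to the same output. The restriction to GF(2), the tiny dimensions, and the function names are assumptions made purely for illustration.

```python
from itertools import product


def all_binary_matrices(l, n):
    """Yield every l x n matrix over GF(2) as a list of row tuples."""
    for bits in product((0, 1), repeat=l * n):
        yield [bits[r * n:(r + 1) * n] for r in range(l)]


def apply(A, u):
    """Compute the matrix-vector product A u over GF(2)."""
    return tuple(sum(a * b for a, b in zip(row, u)) % 2 for row in A)


def max_collision_count(l, n):
    """Return (worst-case number of colliding matrices, total matrices)."""
    mats = list(all_binary_matrices(l, n))
    vecs = list(product((0, 1), repeat=n))
    worst = 0
    for u in vecs:
        for v in vecs:
            if u != v:
                hits = sum(apply(A, u) == apply(A, v) for A in mats)
                worst = max(worst, hits)
    return worst, len(mats)
```

With $l = 2$ and $n = 3$ the worst-case count equals the total number of matrices divided by $2^l$, so the bound is met with equality in this case.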
Example 2: In this example, we review the ensemble of $q$-ary sparse matrices introduced in [12]. In the following, let $\mathcal{U} \equiv \mathrm{GF}(q)$ and $l_A \equiv nR$. We generate an $l_A \times n$ matrix $A$ with the following procedure: 

1) Start from an all-zero matrix. 

2) For each $i \in \{1, \ldots, n\}$, repeat the following procedure $\tau$ times: 

a) Choose $(j, a) \in \{1, \ldots, l_A\} \times [\mathrm{GF}(q) - \{0\}]$ uniformly at random. 

b) Add $a$ to the $(j, i)$ component of $A$. 

Let $(\mathcal{A}, p_A)$ be an ensemble corresponding to the above procedure. Then 

$$\mathrm{Im}A = \begin{cases} \left\{\boldsymbol{u} \in \mathcal{U}^{l_A} : \boldsymbol{u} \text{ has an even number of non-zero elements}\right\}, & \text{if } q = 2 \\ \mathcal{U}^{l_A}, & \text{if } q > 2 \end{cases}$$

for all $A \in \mathcal{A}$ and there is $(\alpha_A, \beta_A)$ such that $(\mathcal{A}, p_A)$ has an $(\alpha_A, \beta_A)$-hash property (see [12, Theorem 2]). ■ 
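The generation procedure of Example 2 can be sketched as follows. This is a hedged illustration rather than the authors' implementation: entries are added modulo `q`, which represents GF($q$) arithmetic only when `q` is prime (a prime-power field would need proper field arithmetic), and the function name and the `rng` parameter are assumptions introduced here.

```python
import random


def sparse_matrix(l_A, n, tau, q=2, rng=None):
    """Generate an l_A x n matrix by the column-wise random procedure.

    Assumes q is prime so that addition modulo q matches GF(q) addition.
    """
    rng = rng or random.Random()
    A = [[0] * n for _ in range(l_A)]          # 1) start from an all-zero matrix
    for i in range(n):                         # 2) for each column i
        for _ in range(tau):                   #    repeat tau times
            j = rng.randrange(l_A)             # a) row index, uniform
            a = rng.randrange(1, q)            #    non-zero field element, uniform
            A[j][i] = (A[j][i] + a) % q        # b) add a to the (j, i) component
    return A
```

For $q = 2$ each step flips one entry of the column, so a column generated with an even `tau` always has even weight, which is in line with the even-weight image stated for the binary case.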

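In the following, let $\mathcal{A}$ (resp. $\mathcal{B}$) be a set of functions $A : \mathcal{U}^n \to \overline{\mathcal{U}}_{\mathcal{A}}$ (resp. $B : \mathcal{U}^n \to \overline{\mathcal{U}}_{\mathcal{B}}$). We assume that an ensemble $(\mathcal{A}, p_A)$ has an $(\alpha_A, \beta_A)$-hash property and an ensemble $(\mathcal{A} \times \mathcal{B}, p_A \times p_B)$ also has an $(\alpha_{AB}, \beta_{AB})$-hash property. We also assume that $p_C$ and $p_M$ are the uniform distributions on $\mathrm{Im}\mathcal{A}$ and $\mathrm{Im}\mathcal{B}$, respectively, and that random variables $A$, $B$, $C$, and $M$ are mutually independent, that is, 

$$p_C(\boldsymbol{c}) = \begin{cases} \frac{1}{|\mathrm{Im}\mathcal{A}|}, & \text{if } \boldsymbol{c} \in \mathrm{Im}\mathcal{A} \\ 0, & \text{if } \boldsymbol{c} \in \overline{\mathcal{U}}_{\mathcal{A}} - \mathrm{Im}\mathcal{A} \end{cases}$$
$$p_M(\boldsymbol{m}) = \begin{cases} \frac{1}{|\mathrm{Im}\mathcal{B}|}, & \text{if } \boldsymbol{m} \in \mathrm{Im}\mathcal{B} \\ 0, & \text{if } \boldsymbol{m} \in \overline{\mathcal{U}}_{\mathcal{B}} - \mathrm{Im}\mathcal{B} \end{cases}$$
$$p_{ABCM}(A, B, \boldsymbol{c}, \boldsymbol{m}) = p_A(A)p_B(B)p_C(\boldsymbol{c})p_M(\boldsymbol{m})$$

for any $A$, $B$, $\boldsymbol{c}$, and $\boldsymbol{m}$. 

Fig. 3. Construction of Fixed-rate Source Code 

We use the following lemmas, which are shown in [12]. 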
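Lemma 1 ([12, Lemma 9]): For any $A$ and $\boldsymbol{u} \in \mathcal{U}^n$, 

$$p_C(\{\boldsymbol{c} : A\boldsymbol{u} = \boldsymbol{c}\}) = \sum_{\boldsymbol{c}} p_C(\boldsymbol{c})\chi(A\boldsymbol{u} = \boldsymbol{c}) = \frac{1}{|\mathrm{Im}\mathcal{A}|}$$

and for any $\boldsymbol{u} \in \mathcal{U}^n$, 

$$E_{AC}[\chi(A\boldsymbol{u} = \boldsymbol{c})] = \sum_{A, \boldsymbol{c}} p_{AC}(A, \boldsymbol{c})\chi(A\boldsymbol{u} = \boldsymbol{c}) = \frac{1}{|\mathrm{Im}\mathcal{A}|}.$$

Lemma 2 ([12, Lemma 2]): If $\mathcal{G} \subset \mathcal{U}^n$ and $\boldsymbol{u} \notin \mathcal{G}$, then 

$$p_A(\{A : \mathcal{G} \cap \mathcal{C}_A(A\boldsymbol{u}) \ne \emptyset\}) \le \frac{|\mathcal{G}|\alpha_A}{|\mathrm{Im}\mathcal{A}|} + \beta_A.$$

Lemma 3 ([12, Lemma 5]): If $\mathcal{T} \ne \emptyset$, then 

$$p_{ABCM}(\{(A, B, \boldsymbol{c}, \boldsymbol{m}) : \mathcal{T} \cap \mathcal{C}_{AB}(\boldsymbol{c}, \boldsymbol{m}) = \emptyset\}) \le \alpha_{AB} - 1 + \frac{|\mathrm{Im}\mathcal{A}||\mathrm{Im}\mathcal{B}|[\beta_{AB} + 1]}{|\mathcal{T}|}.$$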



When $(\mathcal{A}, p_A)$ and $(\mathcal{B}, p_B)$ are the ensembles of $l_A \times n$ and $l_B \times n$ linear matrices, respectively, we have the following lemma. 

Lemma 4 ([12, Lemma 7]): The joint distribution $(\mathcal{A} \times \mathcal{B}, p_{AB})$ has an $(\alpha_{AB}, \beta_{AB})$-hash property for the ensemble of functions $A \oplus B : \mathcal{U}^n \to \mathcal{U}^{l_A + l_B}$ defined as 

$$A \oplus B(\boldsymbol{u}) \equiv (A\boldsymbol{u}, B\boldsymbol{u}),$$

where 

$$\alpha_{AB}(n) = \alpha_A(n)\alpha_B(n) \tag{3}$$
$$\beta_{AB}(n) = \min\{\beta_A(n), \beta_B(n)\}. \tag{4}$$

IV. Fixed-rate Lossless Universal Source Coding 

In this section, we consider the fixed-rate lossless universal source coding illustrated in Fig. 1. 
For a given encoding rate $R$, $l_A$ is given by 

$$l_A \equiv \frac{nR}{\log|\mathcal{X}|}.$$

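We fix a function 

$$A : \mathcal{X}^n \to \mathcal{X}^{l_A}$$

which is available to construct an encoder and a decoder. We define the encoder and the decoder (illustrated in Fig. 3) 

$$\varphi_X : \mathcal{X}^n \to \mathcal{X}^{l_A}$$
$$\varphi_X^{-1} : \mathcal{X}^{l_A} \to \mathcal{X}^n$$

as 

$$\varphi_X(\boldsymbol{x}) \equiv A\boldsymbol{x}$$
$$\varphi_X^{-1}(\boldsymbol{c}) \equiv g_A(\boldsymbol{c}),$$

where 

$$g_A(\boldsymbol{c}) \equiv \arg\min_{\boldsymbol{x}' \in \mathcal{C}_A(\boldsymbol{c})} H(\boldsymbol{x}').$$

The error probability $\mathrm{Error}_X(A)$ is given by 

$$\mathrm{Error}_X(A) \equiv \mu_X\left(\left\{\boldsymbol{x} : \varphi_X^{-1}(\varphi_X(\boldsymbol{x})) \ne \boldsymbol{x}\right\}\right).$$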

We have the following theorem. It should be noted that the alphabet $\mathcal{X}$ may not be binary. 

Theorem 1: Assume that an ensemble $(\mathcal{A}, p_A)$ has an $(\alpha_A, \beta_A)$-hash property. For a fixed rate $R$, $\delta > 0$ and a sufficiently large $n$, there is a function (matrix) $A \in \mathcal{A}$ such that 

$$\mathrm{Error}_X(A) \le \max\left\{\frac{\alpha_A|\mathcal{X}|^{l_A}}{|\mathrm{Im}\mathcal{A}|}, 1\right\} 2^{-n[\inf F_X(R) - 2\lambda_{\mathcal{X}}]} + \beta_A \tag{5}$$

for any stationary memoryless source $X$ satisfying 

$$H(X) < R, \tag{6}$$

where 

$$F_X(R) \equiv \min_{U'}\left[D(\nu_{U'}\|\mu_X) + |R - H(U')|^+\right]$$

and the infimum is taken over all $X$ satisfying (6). Since 

$$\inf_{X : H(X) < R} F_X(R) > 0,$$

the error probability goes to zero as $n \to \infty$ for all $X$ satisfying (6). ■ 
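We can prove the coding theorem for a channel $\mu_{Y|X}$ with additive noise $Z \equiv Y - X$ by letting $A$ and $\mathcal{C}_A(\boldsymbol{0}) = \{\boldsymbol{x} : A\boldsymbol{x} = \boldsymbol{0}\}$ be a parity check matrix and a set of codewords (channel inputs), respectively. Then the encoding rate of this channel code is given by 

$$\frac{\log|\mathcal{C}_A(\boldsymbol{0})|}{n} \ge \log|\mathcal{X}| - R$$

and the error probability is given as 

$$\mathrm{Error}_{Y|X}(A) \equiv \frac{1}{|\mathcal{C}_A(\boldsymbol{0})|} \sum_{\boldsymbol{x} \in \mathcal{C}_A(\boldsymbol{0})} \mu_{Y|X}\left(\left\{\boldsymbol{y} : g_A(A\boldsymbol{y}) \ne \boldsymbol{y} - \boldsymbol{x}\right\} \mid \boldsymbol{x}\right).$$

Since 

$$\boldsymbol{z} = \boldsymbol{y} - \boldsymbol{x}$$
$$A\boldsymbol{z} = A\boldsymbol{y} - A\boldsymbol{x} = A\boldsymbol{y},$$

the decoding of channel input $\boldsymbol{x}$ from a syndrome $A\boldsymbol{y}$ is equivalent to the decoding of source output $\boldsymbol{z}$ from its codeword $A\boldsymbol{z}$ by using $g_A$. We have the following corollary. 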
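The reduction from channel decoding to source decoding can be illustrated with a toy binary example. The sketch below assumes GF(2), uses an exhaustive search over the coset in place of the linear programming decoder of [2], and is only feasible for very small $n$; the function names and the parity check matrix in the usage note are hypothetical choices made for this illustration.

```python
from collections import Counter
from itertools import product
from math import log2


def empirical_entropy(x):
    """Empirical entropy of the sequence x in bits."""
    n = len(x)
    return -sum((k / n) * log2(k / n) for k in Counter(x).values())


def syndrome(A, x):
    """Compute A x over GF(2)."""
    return tuple(sum(a * b for a, b in zip(row, x)) % 2 for row in A)


def min_entropy_decode(A, c):
    """Brute-force minimum-entropy decoder over the coset {x : A x = c}."""
    n = len(A[0])
    coset = [x for x in product((0, 1), repeat=n) if syndrome(A, x) == tuple(c)]
    return min(coset, key=empirical_entropy) if coset else None


def decode_channel(A, y):
    """Estimate the noise from the syndrome A y, then subtract it from y."""
    z_hat = min_entropy_decode(A, syndrome(A, y))
    return tuple((yi - zi) % 2 for yi, zi in zip(y, z_hat))
```

With `A = [[1,0,0,0,0],[0,1,0,0,0],[0,0,1,1,1]]`, the codeword `(0,0,1,1,0)` corrupted by the noise `(1,0,0,0,0)` has the same syndrome as the noise itself, and the noise is the unique minimum-entropy member of its coset, so the codeword is recovered. Because the empirical entropy does not distinguish a type from its permutations, ties can occur for other matrices and a tie-breaking rule would then be needed.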

Corollary 2: Assume that an ensemble $(\mathcal{A}, p_A)$ of linear functions has an $(\alpha_A, \beta_A)$-hash property. For a fixed rate $R$, $\delta > 0$ and a sufficiently large $n$, there is a (sparse) matrix $A \in \mathcal{A}$ such that 

$$\mathrm{Error}_{Y|X}(A) \le \max\left\{\frac{\alpha_A|\mathcal{X}|^{l_A}}{|\mathrm{Im}\mathcal{A}|}, 1\right\} 2^{-n[\inf F_Z(R) - 2\lambda_{\mathcal{X}}]} + \beta_A$$

for any stationary memoryless channel with additive noise $Z$ satisfying 

$$\log|\mathcal{X}| - R < I(X;Y) \le \log|\mathcal{X}| - H(Z), \tag{7}$$

where the infimum is taken over all $Z$ satisfying (7) and the error probability goes to zero as $n \to \infty$ for all $Z$ satisfying (7). ■ 
Remark 1: It should be noted here that the condition (H2) can be replaced by 

$$\lim_{n \to \infty} \frac{\log \alpha_A(n)}{n} = 0. \tag{8}$$

By using the expurgation technique described in [1], we obtain an ensemble of sparse matrices that have an $(\alpha_A, \boldsymbol{0})$-hash property, where (H2) is replaced by (8). This implies that we can omit the term $\beta_A$ from the upper bound of the error probability. ■ 

Remark 2: Since a class of universal hash functions with a uniform distribution and an ensemble of all linear functions have a $(\boldsymbol{1}, \boldsymbol{0})$-hash property, we obtain the same results as those reported in [7] and [3], respectively, where $F_X$ represents the error exponent function. When $(\mathcal{A}, p_A)$ is an ensemble of sparse matrices and $(\alpha_A, \beta_A)$ is defined properly, we have the same result as that found in [8]. ■ 

V. Fixed-rate Universal Channel Coding 

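The code for the channel coding problem (illustrated in Fig. 2) is given in the following (illustrated in Fig. 4). The idea for the construction is drawn from [10][12][9]. We give the explicit construction of the encoder by using minimum-divergence encoding, which is not described in [10][12][9]. 

Fig. 4. Construction of Channel Code 

For given $R_A, R_B > 0$, let 

$$A : \mathcal{X}^n \to \mathcal{X}^{l_A}$$
$$B : \mathcal{X}^n \to \mathcal{X}^{l_B}$$

satisfying 

$$R_A = \frac{\log|\mathrm{Im}\mathcal{A}|}{n}$$
$$R_B = \frac{\log|\mathrm{Im}\mathcal{B}|}{n},$$

respectively. 

We fix functions $A$, $B$ and a vector $\boldsymbol{c} \in \mathcal{X}^{l_A}$ available to construct an encoder and a decoder. We define the encoder and the decoder 

$$\varphi : \mathcal{X}^{l_B} \to \mathcal{X}^n$$
$$\varphi^{-1} : \mathcal{Y}^n \to \mathcal{X}^{l_B}$$

as 

$$\varphi(\boldsymbol{m}) \equiv g_{AB}(\boldsymbol{c}, \boldsymbol{m})$$
$$\varphi^{-1}(\boldsymbol{y}) \equiv Bg_A(\boldsymbol{c}, \boldsymbol{y}),$$

where 

$$g_{AB}(\boldsymbol{c}, \boldsymbol{m}) \equiv \arg\min_{\boldsymbol{x}' \in \mathcal{C}_{AB}(\boldsymbol{c}, \boldsymbol{m})} D(\nu_{\boldsymbol{x}'}\|\mu_X)$$
$$g_A(\boldsymbol{c}, \boldsymbol{y}) \equiv \arg\min_{\boldsymbol{x}' \in \mathcal{C}_A(\boldsymbol{c})} H(\boldsymbol{x}'|\boldsymbol{y}).$$

The error probability $\mathrm{Error}_{Y|X}(A, B, \boldsymbol{c})$ is given by 

$$\mathrm{Error}_{Y|X}(A, B, \boldsymbol{c}) \equiv \sum_{\boldsymbol{m}, \boldsymbol{y}} p_M(\boldsymbol{m})\mu_{Y|X}(\boldsymbol{y}|\varphi(\boldsymbol{m}))\chi(\varphi^{-1}(\boldsymbol{y}) \ne \boldsymbol{m}),$$

where 

$$p_M(\boldsymbol{m}) \equiv \begin{cases} \frac{1}{|\mathrm{Im}\mathcal{B}|}, & \text{if } \boldsymbol{m} \in \mathrm{Im}\mathcal{B} \\ 0, & \text{if } \boldsymbol{m} \notin \mathrm{Im}\mathcal{B}. \end{cases}$$

It should be noted that $\mathrm{Im}\mathcal{B}$ represents a set of all messages and $R_B$ represents the encoding rate of a channel. We have the following theorem. 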

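Theorem 3: Assume that an ensemble $(\mathcal{A}, p_A)$ (resp. $(\mathcal{A} \times \mathcal{B}, p_{AB})$) has an $(\alpha_A, \beta_A)$-hash (resp. $(\alpha_{AB}, \beta_{AB})$-hash) property. For fixed rates $R_A, R_B > 0$, a given input distribution $\mu_X$ satisfying 

$$H(X) > R_A + R_B, \tag{9}$$

$\delta > 0$, and a sufficiently large $n$, there are functions (matrices) $A \in \mathcal{A}$, $B \in \mathcal{B}$, and a vector $\boldsymbol{c} \in \mathrm{Im}\mathcal{A}$ such that 

$$\mathrm{Error}_{Y|X}(A, B, \boldsymbol{c}) \le \alpha_{AB} - 1 + \frac{\beta_{AB} + 1}{\kappa} + 2\kappa\left[\max\{\alpha_A, 1\} 2^{-n[\inf F_{Y|X}(R_A) - 2\lambda_{\mathcal{XY}}]} + \beta_A\right] \tag{10}$$

for all $\mu_{Y|X}$ satisfying 

$$H(X|Y) < R_A, \tag{11}$$

where 

$$F_{Y|X}(R) \equiv \min_{V|U}\left[D(\nu_{V|U}\|\mu_{Y|X}|\nu_U) + |R - H(U|V)|^+\right],$$

the infimum is taken over all $\mu_{Y|X}$ satisfying (11), and $\kappa \equiv \{\kappa(n)\}_{n=1}^{\infty}$ is an arbitrary sequence satisfying 

$$\lim_{n \to \infty} \kappa(n) = \infty \tag{12}$$
$$\lim_{n \to \infty} \kappa(n)\beta_A(n) = 0 \tag{13}$$
$$\lim_{n \to \infty} \frac{\log \kappa(n)}{n} = 0 \tag{14}$$

and $\kappa$ denotes $\kappa(n)$. Since 

$$\inf F_{Y|X}(R_A) > 0,$$

the right hand side of (10) goes to zero as $n \to \infty$ for all $\mu_{Y|X}$ satisfying (11). ■ 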
Remark 3: It should be noted here that we have 

$$I(X;Y) > R_B \tag{15}$$

from (11) and (9). However, (11) and (15) do not imply (9) even when $R_A < H(X)$. ■ 

Remark 4: For $\beta_A$ satisfying (H3), there is $\kappa$ satisfying (12)-(14) by letting 

$$\kappa(n) \equiv \begin{cases} n^{\xi}, & \text{if } \beta_A(n) = o(n^{-\xi}) \\ \frac{1}{\sqrt{\beta_A(n)}}, & \text{otherwise} \end{cases} \tag{16}$$

for every $n$. If $\beta_A(n)$ is not $o(n^{-\xi})$, there is $\kappa' > 0$ such that $\beta_A(n)n^{\xi} > \kappa'$ and 

$$\frac{\log \kappa(n)}{n} = \frac{\log \frac{1}{\beta_A(n)}}{2n} \le \frac{\log \frac{n^{\xi}}{\kappa'}}{2n} = \frac{\xi \log n - \log \kappa'}{2n}$$

for all sufficiently large $n$. This implies that $\kappa$ satisfies (14). It should be noted that we can let $\xi$ be arbitrarily large in (16) when $\beta_A(n)$ vanishes exponentially fast. This parameter affects the upper bound of (10). ■ 

Remark 5: From Lemma 4, we have the fact that the condition (H3) of $\beta_B$ is not necessary for the ensembles $(\mathcal{A}, p_A)$ and $(\mathcal{B}, p_B)$ of linear functions. ■ 



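VI. Proof of Theorems 

In this section, we prove the theorems. 

A. Proof of Theorem 1 

Let 

$$\mathcal{G}_U \equiv \{\boldsymbol{x}' : H(\boldsymbol{x}') \le H(U)\}.$$

If $\boldsymbol{x} \in \mathcal{T}_U$ and $g_A(A\boldsymbol{x}) \ne \boldsymbol{x}$, then there is $\boldsymbol{x}' \in \mathcal{C}_A(A\boldsymbol{x})$ such that $\boldsymbol{x}' \ne \boldsymbol{x}$ and 

$$H(\boldsymbol{x}') \le H(\boldsymbol{x}) = H(U),$$

which implies that 

$$[\mathcal{G}_U - \{\boldsymbol{x}\}] \cap \mathcal{C}_A(A\boldsymbol{x}) \ne \emptyset.$$

Then we have 

$$\begin{aligned}
E_A[\mathrm{Error}_X(A)] &= E_A\left[\sum_{\boldsymbol{x}} \mu_X(\boldsymbol{x})\chi(g_A(A\boldsymbol{x}) \ne \boldsymbol{x})\right] \\
&\le \sum_{U} \sum_{\boldsymbol{x} \in \mathcal{T}_U} \mu_X(\boldsymbol{x})\, p_A\left(\left\{A : [\mathcal{G}_U - \{\boldsymbol{x}\}] \cap \mathcal{C}_A(A\boldsymbol{x}) \ne \emptyset\right\}\right) \\
&\le \sum_{U} \sum_{\boldsymbol{x} \in \mathcal{T}_U} \mu_X(\boldsymbol{x}) \min\left\{\frac{|\mathcal{G}_U|\alpha_A}{|\mathrm{Im}\mathcal{A}|} + \beta_A, 1\right\} \\
&\le \sum_{U} \sum_{\boldsymbol{x} \in \mathcal{T}_U} \mu_X(\boldsymbol{x}) \max\left\{\frac{\alpha_A|\mathcal{X}|^{l_A}}{|\mathrm{Im}\mathcal{A}|}, 1\right\} 2^{-n[|R - H(U)|^+ - \lambda_{\mathcal{X}}]} + \beta_A \\
&\le \sum_{U} \max\left\{\frac{\alpha_A|\mathcal{X}|^{l_A}}{|\mathrm{Im}\mathcal{A}|}, 1\right\} 2^{-n[D(\nu_U\|\mu_X) + |R - H(U)|^+ - \lambda_{\mathcal{X}}]} + \beta_A \\
&\le \max\left\{\frac{\alpha_A|\mathcal{X}|^{l_A}}{|\mathrm{Im}\mathcal{A}|}, 1\right\} 2^{-n[\inf F_X(R) - 2\lambda_{\mathcal{X}}]} + \beta_A,
\end{aligned}$$

where the second inequality comes from Lemma 2, the third inequality comes from Lemma 8, the fourth inequality comes from Lemmas 6 and 7, and the last inequality comes from the definition of $F_X$ and Lemma 5. Then we have the fact that there is a function (matrix) $A \in \mathcal{A}$ satisfying (5). ■ 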
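B. Proof of Theorem 3 

Let $UV \equiv \nu_{\boldsymbol{x}\boldsymbol{y}}$ be a joint type of the sequence $(\boldsymbol{x}, \boldsymbol{y}) \in \mathcal{X}^n \times \mathcal{Y}^n$, where the marginal type $U$ is defined as 

$$U \equiv \arg\min_{U'} D(\nu_{U'}\|\mu_X) \tag{17}$$

and the conditional type given type $U$ is denoted by $V|U$. Since $R_A + R_B < H(X)$ and $H(U)$ approaches $H(X)$ as $n$ goes to infinity because of the law of large numbers and the continuity of the entropy function, we have 

$$H(U) - \lambda_{\mathcal{X}} > R_A + R_B + \frac{\log \kappa}{n}$$

for all sufficiently large $n$. Then we have 

$$|\mathcal{T}_U| \ge 2^{n[H(U) - \lambda_{\mathcal{X}}]} \ge \kappa 2^{n[R_A + R_B]} = \kappa|\mathrm{Im}\mathcal{A}||\mathrm{Im}\mathcal{B}|$$

for all sufficiently large $n$, where the first inequality comes from Lemma 6. This implies that there is $\mathcal{T} \subset \mathcal{T}_U$ such that 

$$\kappa \le \frac{|\mathcal{T}|}{|\mathrm{Im}\mathcal{A}||\mathrm{Im}\mathcal{B}|} \le 2\kappa \tag{18}$$

for all sufficiently large $n$. 

Let 

- (UC1) $g_{AB}(\boldsymbol{c}, \boldsymbol{m}) \in \mathcal{T}$ 
- (UC2) $g_A(\boldsymbol{c}, \boldsymbol{y}) = g_{AB}(\boldsymbol{c}, \boldsymbol{m})$. 

Then we have 

$$\mathrm{Error}_{Y|X}(A, B, \boldsymbol{c}) \le p_{MY}(\mathcal{S}_1^c) + p_{MY}(\mathcal{S}_1 \cap \mathcal{S}_2^c), \tag{19}$$

where 

$$\mathcal{S}_i \equiv \{(\boldsymbol{m}, \boldsymbol{y}) : \text{(UC}i\text{)}\}.$$

First, we evaluate $E_{ABC}[p_{MY}(\mathcal{S}_1^c)]$. We have 

$$\begin{aligned}
E_{ABC}[p_{MY}(\mathcal{S}_1^c)] &= p_{ABCM}(\{(A, B, \boldsymbol{c}, \boldsymbol{m}) : \mathcal{T} \cap \mathcal{C}_{AB}(\boldsymbol{c}, \boldsymbol{m}) = \emptyset\}) \\
&\le \alpha_{AB} - 1 + \frac{|\mathrm{Im}\mathcal{A}||\mathrm{Im}\mathcal{B}|[\beta_{AB} + 1]}{|\mathcal{T}|} \\
&\le \alpha_{AB} - 1 + \frac{\beta_{AB} + 1}{\kappa},
\end{aligned} \tag{20}$$

where the equality comes from the property of $\mathcal{T}$, the first inequality comes from Lemma 3, and the second inequality comes from (18). 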

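Next, we evaluate $E_{ABC}[p_{MY}(\mathcal{S}_1 \cap \mathcal{S}_2^c)]$. Let 

$$\mathcal{G}(\boldsymbol{y}) \equiv \{\boldsymbol{x}' : H(\boldsymbol{x}'|\boldsymbol{y}) \le H(U|V)\}$$

and assume that $(\boldsymbol{x}, \boldsymbol{y}) \in \mathcal{T}_{UV}$. Then we have 

$$\begin{aligned}
E_{AC}[\chi(A\boldsymbol{x} = \boldsymbol{c})\chi(g_A(\boldsymbol{c}, \boldsymbol{y}) \ne \boldsymbol{x})]
&= p_{AC}\left(\left\{(A, \boldsymbol{c}) : \begin{array}{l} A\boldsymbol{x} = \boldsymbol{c} \\ \exists \boldsymbol{x}' \ne \boldsymbol{x} \text{ s.t. } H(\boldsymbol{x}'|\boldsymbol{y}) \le H(\boldsymbol{x}|\boldsymbol{y}) \text{ and } A\boldsymbol{x}' = \boldsymbol{c} \end{array}\right\}\right) \\
&= \sum_{A} p_A(A)\,\chi\left(\exists \boldsymbol{x}' \ne \boldsymbol{x} \text{ s.t. } H(\boldsymbol{x}'|\boldsymbol{y}) \le H(\boldsymbol{x}|\boldsymbol{y}) \text{ and } A\boldsymbol{x}' = A\boldsymbol{x}\right) p_C(\{\boldsymbol{c} : A\boldsymbol{x} = \boldsymbol{c}\}) \\
&= \frac{1}{|\mathrm{Im}\mathcal{A}|}\, p_A\left(\left\{A : \exists \boldsymbol{x}' \ne \boldsymbol{x} \text{ s.t. } H(\boldsymbol{x}'|\boldsymbol{y}) \le H(U|V) \text{ and } A\boldsymbol{x}' = A\boldsymbol{x}\right\}\right) \\
&\le \frac{1}{|\mathrm{Im}\mathcal{A}|} \min\left\{\sum_{\boldsymbol{x}' \in [\mathcal{G}(\boldsymbol{y}) - \{\boldsymbol{x}\}]} p_A(\{A : A\boldsymbol{x} = A\boldsymbol{x}'\}), 1\right\} \\
&\le \frac{1}{|\mathrm{Im}\mathcal{A}|}\left[\max\{\alpha_A, 1\} 2^{-n[|R_A - H(U|V)|^+ - \lambda_{\mathcal{XY}}]} + \beta_A\right],
\end{aligned} \tag{21}$$

where $|\cdot|^+$ is defined by (2), the third equality comes from Lemma 1, and the second inequality comes from Lemma 8 and (H4) for an ensemble $(\mathcal{A}, p_A)$. Then we have 

$$\begin{aligned}
E_{ABC}[p_{MY}(\mathcal{S}_1 \cap \mathcal{S}_2^c)]
&= E_{ABCM}\left[\sum_{\boldsymbol{x} \in \mathcal{T}} \sum_{V|U} \sum_{\boldsymbol{y} \in \mathcal{T}_{V|U}(\boldsymbol{x})} \mu_{Y|X}(\boldsymbol{y}|\boldsymbol{x})\chi(g_{AB}(\boldsymbol{c}, \boldsymbol{m}) = \boldsymbol{x})\chi(g_A(\boldsymbol{c}, \boldsymbol{y}) \ne \boldsymbol{x})\right] \\
&\le E_{ABCM}\left[\sum_{\boldsymbol{x} \in \mathcal{T}} \sum_{V|U} \sum_{\boldsymbol{y} \in \mathcal{T}_{V|U}(\boldsymbol{x})} \mu_{Y|X}(\boldsymbol{y}|\boldsymbol{x})\chi(A\boldsymbol{x} = \boldsymbol{c})\chi(B\boldsymbol{x} = \boldsymbol{m})\chi(g_A(\boldsymbol{c}, \boldsymbol{y}) \ne \boldsymbol{x})\right] \\
&= \sum_{\boldsymbol{x} \in \mathcal{T}} \sum_{V|U} \sum_{\boldsymbol{y} \in \mathcal{T}_{V|U}(\boldsymbol{x})} \mu_{Y|X}(\boldsymbol{y}|\boldsymbol{x}) E_{AC}[\chi(A\boldsymbol{x} = \boldsymbol{c})\chi(g_A(\boldsymbol{c}, \boldsymbol{y}) \ne \boldsymbol{x})]\, E_{BM}[\chi(B\boldsymbol{x} = \boldsymbol{m})] \\
&\le \frac{1}{|\mathrm{Im}\mathcal{A}||\mathrm{Im}\mathcal{B}|} \sum_{\boldsymbol{x} \in \mathcal{T}} \sum_{V|U} \sum_{\boldsymbol{y} \in \mathcal{T}_{V|U}(\boldsymbol{x})} \mu_{Y|X}(\boldsymbol{y}|\boldsymbol{x}) \left[\max\{\alpha_A, 1\} 2^{-n[|R_A - H(U|V)|^+ - \lambda_{\mathcal{XY}}]} + \beta_A\right] \\
&\le \frac{1}{|\mathrm{Im}\mathcal{A}||\mathrm{Im}\mathcal{B}|} \sum_{\boldsymbol{x} \in \mathcal{T}} \left[\sum_{V|U} \max\{\alpha_A, 1\} 2^{-n[D(\nu_{V|U}\|\mu_{Y|X}|\nu_U) + |R_A - H(U|V)|^+ - \lambda_{\mathcal{XY}}]} + \beta_A\right] \\
&\le \frac{|\mathcal{T}|}{|\mathrm{Im}\mathcal{A}||\mathrm{Im}\mathcal{B}|} \left[\max\{\alpha_A, 1\} 2^{-n[\inf F_{Y|X}(R_A) - 2\lambda_{\mathcal{XY}}]} + \beta_A\right] \\
&\le 2\kappa\left[\max\{\alpha_A, 1\} 2^{-n[\inf F_{Y|X}(R_A) - 2\lambda_{\mathcal{XY}}]} + \beta_A\right],
\end{aligned} \tag{22}$$

where the second inequality comes from Lemma 1 and (21), the third inequality comes from Lemmas 7 and 6, the fourth inequality comes from the definition of $F_{Y|X}$ and Lemma 5, and the last inequality comes from (18). 

From (19), (20), and (22) we have 

$$E_{ABC}[\mathrm{Error}_{Y|X}(A, B, \boldsymbol{c})] \le \alpha_{AB} - 1 + \frac{\beta_{AB} + 1}{\kappa} + 2\kappa\left[\max\{\alpha_A, 1\} 2^{-n[\inf F_{Y|X}(R_A) - 2\lambda_{\mathcal{XY}}]} + \beta_A\right].$$

Applying the above argument for all $\mu_{Y|X}$ satisfying (11) and (9), we have the fact that there are $A \in \mathcal{A}$, $B \in \mathcal{B}$, and $\boldsymbol{c} \in \mathrm{Im}\mathcal{A}$ that satisfy (10). ■ 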

VII. Conclusion 

The fixed-rate universal coding theorems are proved by using the notion of hash property. We proved 
the theorems of fixed-rate lossless universal source coding and fixed-rate universal channel coding. Since an 
ensemble of sparse matrices satisfies the hash property requirement, it is proved that we can construct universal 
codes by using sparse matrices. 

Appendix 

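We introduce the following lemmas that are used in the proofs of the theorems. 

Lemma 5 ([4, Lemma 2.2]): The number of different types of sequences in $\mathcal{X}^n$ is fewer than $[n+1]^{|\mathcal{X}|}$. The number of conditional types of sequences in $\mathcal{X}^n \times \mathcal{Y}^n$ is fewer than $[n+1]^{|\mathcal{X}||\mathcal{Y}|}$. ■ 

Lemma 6 ([4, Lemma 2.3]): For a type $U$ of a sequence in $\mathcal{X}^n$, 

$$2^{n[H(U) - \lambda_{\mathcal{X}}]} \le |\mathcal{T}_U| \le 2^{nH(U)},$$

where $\lambda_{\mathcal{X}}$ is defined in (1). ■ 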
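The counting bounds of Lemmas 5 and 6 can be checked numerically for small parameters. The sketch below is an illustration only; the helper names are assumptions, a type is represented by its vector of symbol counts, and the bound on the type class size is taken in the standard form $2^{n[H(U) - \lambda_{\mathcal{X}}]} \le |\mathcal{T}_U| \le 2^{nH(U)}$ with $\lambda_{\mathcal{X}}$ as in (1).

```python
from math import comb, factorial, log2


def num_types(n, k):
    """Number of types with denominator n over an alphabet of size k."""
    return comb(n + k - 1, k - 1)


def type_class_size(counts):
    """Size of the type class with the given symbol counts (multinomial)."""
    size = factorial(sum(counts))
    for c in counts:
        size //= factorial(c)
    return size


def type_entropy(counts):
    """Entropy in bits of the type with the given symbol counts."""
    n = sum(counts)
    return -sum((c / n) * log2(c / n) for c in counts if c > 0)


def lam(k, n):
    """lambda(n) = k log(n + 1) / n for an alphabet of size k, as in (1)."""
    return k * log2(n + 1) / n
```

For a binary alphabet and $n = 10$ there are 11 types, fewer than $[n+1]^{|\mathcal{X}|} = 121$, and each type class size lies between the two exponential bounds.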
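Lemma 7 ([4, Lemma 2.6]): 

$$\frac{1}{n} \log \frac{1}{\mu_X(\boldsymbol{x})} = H(\nu_{\boldsymbol{x}}) + D(\nu_{\boldsymbol{x}}\|\mu_X).$$

■ 

Lemma 8 ([11, Lemma 2]): For $\boldsymbol{y} \in \mathcal{T}_V$, 

$$|\{\boldsymbol{x}' : H(\boldsymbol{x}') \le H(U)\}| \le 2^{n[H(U) + \lambda_{\mathcal{X}}]}$$
$$|\{\boldsymbol{x}' : H(\boldsymbol{x}'|\boldsymbol{y}) \le H(U|V)\}| \le 2^{n[H(U|V) + \lambda_{\mathcal{XY}}]},$$

where $\lambda_{\mathcal{X}}$ and $\lambda_{\mathcal{XY}}$ are defined by (1). ■ 

Proof: The first inequality of this lemma is shown by the second inequality. The second inequality is shown by 

$$\begin{aligned}
|\{\boldsymbol{x}' : H(\boldsymbol{x}'|\boldsymbol{y}) \le H(U|V)\}| &= \sum_{U' : H(U'|V) \le H(U|V)} |\mathcal{T}_{U'|V}(\boldsymbol{y})| \\
&\le \sum_{U' : H(U'|V) \le H(U|V)} 2^{nH(U'|V)} \\
&\le \sum_{U' : H(U'|V) \le H(U|V)} 2^{nH(U|V)} \\
&\le 2^{n[H(U|V) + \lambda_{\mathcal{XY}}]},
\end{aligned}$$

where the first inequality comes from Lemma 6 and the third inequality comes from Lemma 5. ■ 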